Multi-dimensional feature merging for open domain question answering

ABSTRACT

Methods/systems receive a question and automatically search sources of data containing passages to produce candidate answers to the question. The searching identifies passages that support each of the candidate answers based on scoring features that indicate whether the candidate answers are correct answers to the question. These methods/systems automatically create a scoring feature-specific matrix for each scoring feature. Each scoring feature-specific matrix has a score field for each different combination of text passage and question term (vector), and each score field holds a score value (vector value) indicating how each different combination of text passage and question term supports the candidate answers as being a correct answer to the question. Next, such methods/systems automatically combine multiple such vectors to produce a combined vector score for each of the candidate answers, and then rank the candidate answers based on the combined scores.

CROSS REFERENCES TO RELATED APPLICATIONS

This application is a Continuation of U.S. application Ser. No. 13/679,169, filed Nov. 16, 2012, the entire teachings of which are incorporated herein by reference.

BACKGROUND

The present disclosure relates to automated question answering (QA) systems and methods, and more specifically, to such QA methods and systems that score evidence sources used to support candidate answers.

Most existing factoid question answering systems adopt search strategies and scoring methods with the assumption that a short passage exists in the reference corpus which contains sufficient information to answer each question. This assumption largely holds true for short and focused factoid questions such as “When did Hawaii become a state?” and “What strait separates North America from Asia?” However, some more complex factoid questions contain facts encompassing multiple facets of the answer, which often cannot be found together in a short text passage. Consider the following examples:

(1) Born in 1956, this Swedish tennis player won 6 French Opens & 5 straight Wimbledons (A: Björn Borg); and

(2) A small- and medium-vessel vasculitis characterized by allergic rhinitis, asthma, and peripheral eosinophilia (A: Churg-Strauss Syndrome).

In both examples, information presented in the question can reasonably be expected to be in documents that describe the respective answer entities. However, it is quite unlikely that all the information will be present in one or two adjacent sentences in the document. More specifically, in example (1), birth year and nationality information is in the basic biographic section of documents about Björn Borg, while statistics about his tennis record can generally be found in a section about Borg's career. Similarly for example (2), being a small/medium-vessel vasculitis can generally be found in an initial definitional sentence about Churg-Strauss Syndrome, while the rest of the question naturally falls under a section that describes typical symptoms of the syndrome. As a result, a typical passage retrieved from most reference corpora would cover only a portion of the facts given in the question.

These multi-faceted factoid questions present a challenge for existing question answering systems which make the aforementioned assumption. Consider the following short passages relevant to the question in example (2):

(2.1a) Churg-Strauss syndrome is a medium and small vessel autoimmune vasculitis, leading to necrosis.

(2.1b) The Churg-Strauss syndrome is a multisystem disorder characterized by chronic rhinosinusitis, asthma, and prominent peripheral blood eosinophilia.

(2.2a) Wegener's granulomatosis is an incurable form of vasculitis that affects small and medium-size blood vessels.

(2.2b) Wegener granulomatosis is a rare multisystem autoimmune disease of unknown etiology. Its hallmark features include necrotizing granulomatous inflammation and pauci-immune vasculitis in small- and medium-sized blood vessels.

Existing systems that evaluate each passage separately against the question would view each passage as having a similar degree of support for either Churg-Strauss syndrome or Wegener's granulomatosis as the answer to the question. However, these systems lose sight of the fact that even though each passage covers half of the facts in the question, (2.1a) and (2.1b) cover disjoint subsets of the facts, while (2.2a) and (2.2b) address the same set of facts.

SUMMARY

Exemplary embodiments herein include methods and/or systems that receive a question (comprising question terms) into a user interface of a computerized device. Such methods/systems automatically search sources of data containing passages using a processor of the computerized device to produce candidate answers to the question. The searching is based on the question terms, and the searching identifies passages that support each of the candidate answers based on scoring features that indicate whether the candidate answers are correct answers to the question.

These methods/systems also automatically create a scoring feature-specific matrix for each scoring feature using the processor. Each scoring feature-specific matrix has a score field for each different combination of text passage and question term, and each score field holds a score value (vector value) indicating how each different combination of text passage and question term supports the candidate answers as being a correct answer to the question. Thus, each different combination of text passage and question term makes up a vector having a vector value equal to that maintained in the score field. Next, such methods/systems automatically combine multiple such vectors using the processor to produce a combined vector score for each of the candidate answers, and then rank the candidate answers based on the combined scores using the processor.

Other exemplary methods/systems herein similarly receive a question comprising question terms into a user interface of a computerized device. Such methods/systems also automatically search sources of data containing passages (e.g., text passages) using a processor of the computerized device to produce candidate answers to the question. The searching is based on hypotheses generated from the question terms, and the searching identifies sources of evidence that support each of the candidate answers based on scoring features that indicate whether the candidate answers are correct answers to the question. Thus, the scoring features define the types of analysis that are used to determine whether the candidate answers are correct. The results of the search can be transformed by using a transformation function that takes an input value and information that provides context for the input value, and produces a transformed score value.

Such methods/systems automatically create a scoring feature-specific matrix for each scoring feature using the processor. Each scoring feature-specific matrix has a score field for each different combination of text passage and question term, and each score field holds a score value (vector value) indicating how each different combination of passage and question term supports the candidate answers as being a correct answer to the question (with respect to the corresponding scoring feature of the matrix). The different combinations of passages and question terms thereby form vectors, and each vector has a vector value equal to the corresponding score value.

The systems/methods herein automatically combine (collapse) the vectors using the processor by calculating a statistical measure of the vectors (e.g., a collapsing function, such as calculating a maximum, a minimum, a sum, a mean, a median, and/or a standard deviation of the vectors) to produce a collapsed score for each of the question terms. Further, the systems/methods herein automatically combine the collapsed score for each question term using the processor to produce a combined score for each of the candidate answers. Then the systems and methods herein can automatically rank the candidate answers based on the combined score for each candidate answer using the processor.

BRIEF DESCRIPTION OF THE DRAWINGS

The embodiments herein will be better understood from the following detailed description with reference to the drawings, which are not necessarily drawn to scale and in which:

FIG. 1 is a table illustrating embodiments herein;

FIG. 2 is a table illustrating embodiments herein;

FIG. 3 is a schematic diagram illustrating operations of embodiments herein;

FIG. 4 is a table illustrating operations of embodiments herein;

FIG. 5 is a flow diagram illustrating embodiments herein;

FIG. 6 is a schematic diagram of a hardware system according to embodiments herein;

FIG. 7 is a schematic diagram of a deployment system according to embodiments herein;

FIG. 8 is a schematic diagram of an integration system according to embodiments herein;

FIG. 9 is a schematic diagram of an on demand system according to embodiments herein;

FIG. 10 is a schematic diagram of a virtual private network system according to embodiments herein; and

FIG. 11 is a schematic diagram of a virtual private network system according to embodiments herein.

DETAILED DESCRIPTION

As mentioned above, existing question answering systems evaluate each passage separately against a question. In view of this, the systems/methods herein provide the notion of multi-dimensional feature merger (or MDM features), which allows passage scoring results to be combined across different dimensions, such as question segments and different passage scoring methods. In this example, MDM features that combine results across question segments would capture the broader coverage of passages (2.1a) and (2.1b) mentioned in the Background section above, and thus enable the system to recognize Churg-Strauss Syndrome as a better answer for the question than Wegener's granulomatosis.

The systems and methods herein provide a general-purpose MDM feature-merging framework that can be adopted in question answering systems that evaluate candidate answers by matching candidate-bearing passages against the question. The systems and methods herein merge passage scoring results across various dimensions, resulting in 1) new features that are more highly correlated with correct answers than the base features from which they were derived, 2) significant component-level performance improvement, and 3) end-to-end performance improvement.

For example, given a question, Q, each of its candidate answers, CA, has a set of supporting passages. In one type of question-answering system, the support of each passage for a candidate answer is quantified. Then a merging strategy is used to combine the support of all passages for a particular candidate answer. The systems and methods herein provide a general framework for merging support from supporting passages.

The methodology of calculating the support of a passage for a candidate answer is called passage scoring. At an abstract level, a passage scorer is responsible for quantifying how well a passage matches a question. The systems and methods herein represent a question and a passage as ordered sets of terms, Q={q₁, q₂, . . . , q_n} and P={p₁, p₂, . . . , p_m}, respectively. Passage scorers align question terms to passage terms and assign a score to how well the terms align. For example, a passage scorer will take as input Q and P and output a vector of scores that represents how well the passage matches the question. The systems and methods herein denote this vector for P as the vector ƒ, such that ƒ_i is the score of how well one of the passage terms matches the i-th term in the question. Note that the length of this vector is fixed per question but may vary across questions.
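
For concreteness, the following is a minimal Python sketch of a passage scorer in this sense, assuming a toy exact-match alignment rule; the function name and matching rule are illustrative only, standing in for the richer syntactic and semantic alignment strategies described later:

    def score_passage(question_terms, passage_terms):
        # f_i = 1.0 if some passage term matches the i-th question term, else 0.0
        passage_set = {t.lower() for t in passage_terms}
        return [1.0 if q.lower() in passage_set else 0.0 for q in question_terms]

    Q = ["large", "land", "animal", "has", "large", "ears"]
    P = "The African Elephant is a very large land animal".split()
    print(score_passage(Q, P))  # [1.0, 1.0, 1.0, 0.0, 1.0, 0.0]

The output vector always has one entry per question term, so its length is fixed for a given question, as noted above.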

The systems and methods herein collect all these vectors, per question and per candidate answer, into a matrix, M. For example, CA₁ may be represented as a matrix where row i corresponds to the passage scoring vector for passage P_i. An element of this matrix, m_(i,j), is the score assigned by one of the passage scorers indicating how well passage P_i aligns with term j in the question Q.

This matrix is of variable dimensions for different candidate answers per question. The number of rows can differ because the number of supporting passages can differ across the candidate answers for a question. Since different questions have different numbers of question terms, the number of columns can differ for candidate answers across questions. Essentially, the dimension of this matrix differs across candidate answers. Therefore, the systems and methods herein cannot capture the distribution of this matrix simply by linearizing the matrix.

The systems and methods herein define a function ƒ: M→R^N that maps each matrix into a feature vector of fixed length. This function is defined as follows:

ƒ(M)=<g(M), g(M′)>

where M′ is the transpose of matrix M and g is a function g: M→R^(N/2) that maps a matrix into a feature vector of fixed length, defined as follows:

g(M)=<sum(S), avg(S), std(S), max(S), min(S), dim(S), non-zero(S)>

where S is a vector of dimensionality equal to the number of columns in M, and each value in S is computed using some function h(v) applied to the corresponding column v in M, where h(v) may be sum(v), avg(v), std(v), max(v), min(v), dim(v), or non-zero(v).

Examples of formulae are given in Table 1 (shown in FIG. 1).
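
As an illustration of these definitions, the following Python sketch (assuming numpy; the function and feature names are illustrative, loosely following labels such as "transpose-max-then-sum" used later) computes a fixed-length feature vector from a variable-size matrix by applying each statistic h down the columns of M to obtain S, collapsing S with each statistic g, and concatenating the results for M and its transpose M′:

    import numpy as np

    STATS = {
        "sum": np.sum, "avg": np.mean, "std": np.std,
        "max": np.max, "min": np.min,
        "dim": lambda v: float(len(v)),
        "non-zero": lambda v: float(np.count_nonzero(v)),
    }

    def g(M):
        # Build S by applying h to every column of M, then collapse S with g.
        features = []
        for h_name, h in STATS.items():
            S = np.array([float(h(M[:, j])) for j in range(M.shape[1])])
            for g_name, g_fn in STATS.items():
                features.append((h_name + "-then-" + g_name, float(g_fn(S))))
        return features

    def f(M):
        # f(M) = <g(M), g(M')>: the transpose swaps the roles of rows and columns.
        return g(M) + [("transpose-" + name, value) for name, value in g(M.T)]

    vec = f(np.array([[1, 2, 3, 0, 0, 0], [1, 0, 4, 5, 6, 7]]))
    print(len(vec))  # fixed length (98 here) regardless of the shape of M

The length of ƒ(M) depends only on the number of statistics chosen, not on the shape of M, which is what makes matrices of varying dimensions comparable.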

Consider an example shown in FIG. 3: “This large land animal also has large ears.” Consider two candidate answers and their supporting passages:

1. Candidate answer 1: African Elephant

(a) P1.1: The African Elephant is a very large land animal. (b) P1.2: African elephants have large ears.

2. Candidate answer 2: Hippo

(a) P2.1: A hippo is a large land animal. (b) P2.2: Hippos have relatively small ears.

Table 2 (shown in FIG. 2) abstractly shows how passage scorers assign values to specific question terms for specific passages. For example, consider the P1.1 row, which represents how well the passage “The African Elephant is a very large land animal” supports the answer elephant for the question “This large land animal also has large ears.” If the passage scorer is effective, it will give a high score to x₁, x₂ and x₃ (because the passage does, indeed, provide strong justification for “elephant” satisfying the requirements of being a large land animal). If the passage scorer is effective, it will give a very small score (typically 0) to x₄, x₅, and x₆, because the passage says nothing about elephants having large ears. However, some passage scorers may be misled by the fact that the term “large” appears twice in the question and either occurrence could align to the one occurrence in the passage. Often some passage scorers match too many terms and thus assign credit to terms that do not deserve it, while others match too few and miss important content; this is why the systems and methods herein have a diverse collection of scorers and let the classifier sort out how much to trust each of them.

Using one merging strategy, say SUM-THEN-MAX (in which ƒ(M)=g(M), g(v)=MAX(v), and h(v)=SUM(v)), candidate answer 1, African Elephant, will get assigned a feature value equal to SUM-THEN-MAX{(x₁+x₂+x₃+x₄+x₅+x₆), (x₇+x₈+x₉+x₁₀+x₁₁+x₁₂)}. So either passage P1.1 or passage P1.2 will be selected as an optimal passage. This merging strategy does not attempt to leverage the complementary information in the two passages. The merging strategy of the systems and methods herein captures the distribution of alignment across passages.
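
To make SUM-THEN-MAX concrete, here is a small numeric sketch in Python (assuming numpy, and borrowing the feature1 values from the Minput example given later for African Elephant):

    import numpy as np

    M = np.array([[1, 2, 3, 0, 0, 0],    # P1.1 term scores (x1..x6)
                  [1, 0, 4, 5, 6, 7]])   # P1.2 term scores (x7..x12)

    # h = SUM collapses each passage's row to one number; g = MAX keeps only
    # the single best passage, discarding the complementary coverage of P1.1.
    print(np.max(np.sum(M, axis=1)))  # 23: P1.2 wins and P1.1 is ignored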

In the systems and methods herein, the framework is general in three ways: 1) it is independent of the type of passage scorer; 2) more matrix operations (like rank(M)) may easily be added to the definitions of the functions g(M) and h(v); and 3) there is value in considering how all candidates match the question. Using the framework of the systems and methods herein, all that is required is adding a new dimension, depth, to the matrix M, thus giving rise to a 3-D matrix, say M3D. Each two-dimensional M in M3D belongs to a candidate answer. Therefore, the depth of M3D is the number of candidate answers per question.

Question/answer (QA) systems analyze a question and generate multiple possible candidate answers. QA systems then apply many different answer-scoring methods, each of which produces features that are used to evaluate whether the answer is correct. One way in which question/answer systems evaluate candidate answers is to first retrieve passages of text that contain the candidate answer. Each passage is then scored using a variety of methods called passage scorers. All of the features are sent to a final merging and ranking component, which uses machine-learning techniques to weigh and combine features to produce a single confidence value estimating the probability that the candidate answer is correct. The features the systems and methods herein introduce are extracted and made available to the machine learning model in the final merging and ranking component, where the scores assigned by different passage scorers are available.

The systems and methods herein find candidate answers, employing a variety of methods to compute feature values relating to those answers, and then use a statistical classifier to determine which answers are correct. For a given question Q, search components find a set of candidate answers {CA₁, CA₂, . . . , CA_n}. The task of the classifier is to decide which of the candidate answers is the correct answer. Each candidate answer is associated with one or more passages that contain the answer and that the search component has determined may be relevant to the question. A subset of the methods that compute feature values are the passage scoring components. These components evaluate the evidence that a single passage provides relating to how well the answer satisfies the requirements of the question. Thus, among the feature values associated with a candidate answer will be some passage scoring features.

The systems and methods herein have a variety of passage scoring methods that use different strategies for determining which parts of a question to attempt to match to each part of a passage and for determining whether two parts of a passage match. Some attempt to align question terms to passage terms using syntactic structure and/or semantic relations, while others use word order or ignore the relationship among terms completely (e.g., simply counting how many question terms appear in the passage, regardless of whether those terms are similarly arranged).

Many scorers score a passage by comparing it to the question. The scorers leverage available annotation components developed for the question/answer systems framework, such as dependency parsing, named entity recognition, coreference resolution, relation detection, etc. The question and the passage are decomposed into sets of terms, where a term can be a single token, a multiword term, or a proper name. All of these scorers try to determine the amount of overlap between the passage and the question by looking at which terms match. The individual scorers put different restrictions on when a term is considered to match. The passage scores are composed using the inverse document frequencies of the matching terms. The inverse document frequency (idf) is considered to be a measure of a term's information content, under the assumption that terms that occur less often are more important. It is computed as

idf(term) = log(N/(n_term+1)),

where N is the number of documents in a large corpus and n_term is the number of documents that contain the term.
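
In code, the computation is direct (a sketch assuming the natural logarithm, since the base is not specified above):

    import math

    def idf(n_term, N):
        # Terms appearing in fewer documents receive higher scores.
        return math.log(N / (n_term + 1))

    print(idf(n_term=999, N=1_000_000))      # 6.907...: a rare term
    print(idf(n_term=500_000, N=1_000_000))  # 0.693...: a common term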

Some exemplary scorers used by systems and methods herein include a passage term match scorer that assigns a score based on which question terms are included in the passage, regardless of word order or grammatical relationship; a scorer that assigns a score based on whether pairs of terms that are connected or nearly connected in the syntactic-semantic structure of the question match corresponding pairs of terms in the passage; a textual alignment scorer that assigns a score based on how well the word order of the passage aligns with that of the question, when the focus is replaced with the candidate answer; a logical form answer candidate scorer that targets high-precision matching between the syntactic structures of passages and questions; etc.

Each passage-scoring component produces a fixed number of features for each passage. Some of these features range from 0 to 1, where a high score indicates that the passage matches the question well; other passage scorers have other ranges. The passage scorers assign a score to passages for a candidate answer. The answer classifier is intended to classify candidate answers that generally have multiple and differing numbers of passages. Consequently, the systems and methods herein use a merger to combine passage scores for <candidate answer, passage> pairs into a fixed set of features. For example, if a candidate answer has three passages and a passage scorer assigns a value of 0.5, 0.6, and 0.7 to each passage, these scores may be merged using a merger strategy like SUM-THEN-MAX. Using this merger strategy, the feature added to the learning model for the candidate answer under consideration will be SUM-THEN-MAX(0.5, 0.6, 0.7)=0.7.

The systems and methods herein use distinct methods to merge features across passages including, for example:

1. Maximum: The final score for the candidate answer is the maximum score for that answer in any passage found for that answer.

2. Sum: The final score for the candidate answer is the sum of the scores for that answer in each of the passages found for that answer.

3. Decaying sum: The final score for the candidate answer is computed to be

Σ_{i=0}^{m} p_i/2^i,

where p₀, p₁, . . . , p_m are the scores of the passages that contain the answer, sorted in descending order.
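
A short Python sketch of this decaying sum (the sorting and the halving weights follow the formula above):

    def decaying_sum(scores):
        # Highest score gets full weight; each subsequent score is halved again.
        return sum(p / 2**i for i, p in enumerate(sorted(scores, reverse=True)))

    print(decaying_sum([0.5, 0.7, 0.6]))  # 0.7 + 0.6/2 + 0.5/4 = 1.125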

Passage scorers capture limited complementary information that the passages have to offer. For example, in FIG. 2, a passage scoring component may assign scores s_1.1 and s_1.2 to passages P1.1 and P1.2, respectively. A merger strategy that takes the maximum across passages will choose MAX(s_1.1, s_1.2) as the optimal supporting passage. However, since these passages have complementary information to offer, it can be advantageous to aggregate this information, and the systems and methods herein do so.

For each of the scoring features, the systems and methods herein derive two other features: a feature for whether the base feature is missing, and a standardized version of the feature. When the value of a feature is missing, the systems and methods herein assert a value of 0 for the feature and a value of 1 for the corresponding derived missing feature; this allows the learner to distinguish between cases where the feature actually has a 0 value and cases where it simply did not apply at all. The standardized version of a feature is computed by subtracting the mean value of that feature and dividing by the standard deviation for that feature. Both mean and standard deviation are computed across all answers to a single question, not across all answers to all questions in the test set. The purpose of the standardized feature is to encode how much the base feature differs from a typical value of that feature for a single question.
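
The following Python sketch shows both derived features for one base feature across the candidate answers to a single question (assuming numpy, with None marking a missing value; names are illustrative, and whether the asserted 0 values participate in the mean and standard deviation is an assumption here):

    import numpy as np

    def derive_features(values):
        missing = [1.0 if v is None else 0.0 for v in values]  # missing-indicator
        filled = np.array([0.0 if v is None else v for v in values])
        std = filled.std()
        standardized = (filled - filled.mean()) / std if std > 0 else filled * 0.0
        return filled, missing, standardized

    base, missing, standardized = derive_features([0.7, 0.0, None, 0.5])
    print(missing)       # [0.0, 0.0, 1.0, 0.0]: separates "missing" from a true 0
    print(standardized)  # each answer's offset from this question's mean, in SDs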

In Table 3 (shown in FIG. 4) the systems and methods herein present some examples of the features the systems and methods herein introduce. These are also the features that achieve a high correlation with the gold class in the training set, which the systems and methods herein call inform analysis. In general, mean, transpose-max-then-sum, and transpose-nonZeroColumns perform well for other passage scorers as well.

The foregoing description presents a narrower, more specific introduction, and the following presents a more general, broadly applicable elaboration. As shown above, methods and systems herein analyze an input question and generate and evaluate candidate answers using a variety of techniques. These features are then combined based on their learned potential for predicting the right answer. The final result of this process is a ranked list of candidate answers, each with a confidence score indicating the degree to which the answer is believed correct, along with links back to the evidence.

A question answering system may divide up the task of answering questions in a variety of ways. For example, for any single answer it might have multiple pieces of evidence regarding that answer. Some pieces of evidence may relate to the entire question while other pieces of evidence may only provide information relevant to part of the question. A deep evidence scoring component can provide a score indicating how well that piece of evidence relates to some part of the question. A question answering system that uses statistical machine learning to classify candidate answers as correct or incorrect needs to ultimately conclude how well all of the evidence relates to all of the question. The systems and methods herein bridge this gap. The systems and methods herein take as input feature values that relate portions or aspects of a question to portions or aspects of evidence, and combine those feature values using a variety of functions to produce feature values that are relevant to whether the candidate answer correctly answers the question.

In the example shown in FIG. 3, there is some evidence to support each of the two answers (African Elephant and Hippo). However, there is more evidence for the former; in particular, there is no evidence for a hippo having ears that are large. An evidence scoring component may recognize that and give African Elephant a high score for having large ears in passage P1.2 while giving Hippo a low score for having large ears in passage P2.2. The systems and methods herein provide a mechanism for combining results that are specific to particular parts of a question and particular pieces of evidence into feature values that can be assigned to a candidate answer. The systems and methods herein operate by combining across feature values, across passages, and across parts of the question in different ways.

A question/answer system's final merger aggregates information across multiple pieces of evidence, each analyzed using different analytics (features). However, the final merger process requires that each of these features be reduced first to a single number and only then combined into a final score for a candidate. Thus, the combination into a single score for a candidate is limited to combining the total amount of evidence that each passage provides, without regard to what portions of the question each piece of evidence addressed. For example, a conventional final merger will not distinguish an answer that has two passages that provide very strong evidence for one important part of a question from one that has one passage that provides very strong evidence for one very important part of the question and another passage that provides very strong evidence for another very important part of the question. In addition, since the systems and methods herein combine results from different features across these dimensions, the systems and methods herein can accommodate interactions among features that only make sense within a given dimension.

For example, a feature indicating whether a passage supports the answer and a feature indicating whether a passage comes from a reliable source are only meaningfully combined at the level of passages, and the systems and methods herein make this possible. In contrast, conventional final merger processing collapses the reliability feature into a single value (thus determining whether the passages, as a whole, are reliable) and each answering feature into a single value (thus determining whether the passages, as a whole, support this answer). This rigid collapsing mechanism makes conventional final merger processing unable to access information about whether the passages that support the answer are the same ones that are reliable.

Further, the approach of the systems and methods herein is domain independent, and the systems and methods herein combine multiple pieces of evidence (possibly assessed using different analytics) for the same fact within a decomposed question.

For each question-answer pair, the systems and methods herein take as input a multi-dimensional feature value matrix. One dimension of the input matrix is labels corresponding to answer scoring features (i.e., labels indicating types of analysis that may be useful for determining whether an answer is correct). There is at least one other dimension for this matrix. Other dimensions may include (but are not limited to) portions of the question and pieces of evidence (e.g., passages) that prove whether the answer is correct. The values in the matrix are feature values for the features specified in the feature label dimension. Those input values are computed by an evidence-scoring component, which uses evidence to determine how well a candidate answer satisfies requirements established in a question. The output of the systems and methods herein can be a single-dimensional vector where the one dimension is answer scoring feature labels and the values in the vector are computed by combining values from across other dimensions.

Because the output of the systems and methods herein can be a single-dimensional vector of features and values, it satisfies the input requirements for a traditional machine learning classifier. Since there is one such vector for each question-answer pair, such a classifier can be used to classify answers as correct or incorrect.

Because the systems and methods herein can have more output features than input features, this can result in a system that has more features than are immediately useful (e.g., can result in models that are overfit to the test set). In that case, a statistical method can select only a subset of the features (using machine learning technology). Providing many output features is useful to the effectiveness of the systems and methods herein, even if only a subset of those features is selected. It is difficult for a human to predict what matrix operations will be most effective at collapsing a complex combination of evidence relating to different parts or aspects of a question down into a single numerical score. The automated feature selection mechanism of the systems and methods herein can do this job effectively if it is given results from a broad selection of such operations and is allowed to select among them empirically.

An exemplary process is set forth below. In the following, shorthand terms are defined as follows. SET_T is a predefined set of transformation functions, which take a single value along with any amount of additional information that provides context for that value, and produce a single value. SET_C is a predefined set of collapsing functions, which take a vector and produce a single value. MAP_C is a map from matrix dimensions to subsets of SET_C. SET_D is a set of matrix dimensions. SET_P is a set of permutations of subsets of SET_D. The run-time inputs include Minput, which is a matrix of candidate answers with multi-dimensional feature matrices (obtained using enabling art such as Watson deep evidence scoring components).

The exemplary process is as follows:

Matrix Mstart=expand Minput using SET_T;

for each P in SET_P:

    M(P,0)=Mstart;

    for i=1 to the length of P:

        Dimension D(P,i)=entry i in P;

        Set SUBSET_C(P,i)=the MAP_C entry for D(P,i);

        Matrix M(P,i)=a new matrix with all the dimensions of M(P,i−1), with the indexes of D(P,i) replaced by new indexes corresponding to the elements in SUBSET_C(P,i);

        for each C in SUBSET_C(P,i):

            for each tuple T(P,i−1) of index values for M(P,i−1) over all of its dimensions other than D(P,i):

                Tuple T(P,i)=T(P,i−1) with the value of the D(P,i) index replaced by a new feature index unique to C and D(P,i);

                the entry in M(P,i) whose indexes are T(P,i) is set to the value computed by applying C to the vector of values of M(P,i−1) whose indexes are T(P,i−1) with all index values for D(P,i);

Vfinal=flatten the final M(P,i) matrices for each P and concatenate all of them.

The output of this process, Vfinal, is a feature vector that can be used for classifying candidate answers as right or wrong (using established ML technology). The transformation of the input can use a predefined set of transformation functions. One example of a transformation function is IDENTITY, which ignores the context and returns its input value. Another is STANDARDIZE, which requires as context all other values for the same feature label for all answers to the question; it transforms a single value by subtracting the mean of all of the context values and dividing by the standard deviation of the context values.

The steps above also can use predefined sets of collapsing functions. A collapsing function is a function that takes a one-dimensional vector of numbers and returns a single number. Examples of collapsing functions include MAXIMUM (the maximum value of any number in the vector), MINIMUM, SUM, MEAN, MEDIAN, and STANDARD DEVIATION. Another useful example is the number of non-zero entries. A more complex but also useful example is a machine learning classifier.
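
The following Python sketch (assuming numpy) implements the process above for the single permutation P=[Passage, Question Term], with IDENTITY as the only transformation and MAP_C assigning {SUM, MAXIMUM, MINIMUM} to both dimensions; minput maps each base feature label to its passage-by-term score matrix. It is a minimal sketch under those assumptions, not a definitive implementation:

    import numpy as np

    COLLAPSERS = {"SUM": np.sum, "MAXIMUM": np.max, "MINIMUM": np.min}

    def collapse(minput, dims=("Passage", "Question Term")):
        # IDENTITY transformation: Mstart = Minput.
        features = {label: M.astype(float) for label, M in minput.items()}
        for dim in dims:
            next_features = {}
            for label, M in features.items():
                for c_name, c_fn in COLLAPSERS.items():
                    # Collapse the current dimension; each earlier pass removed
                    # axis 0, so the dimension to collapse is always leading.
                    next_features[label + "_" + c_name + "(" + dim + ")"] = c_fn(M, axis=0)
            features = next_features
        # After both dimensions are collapsed, every entry is a scalar: Vfinal.
        return {label: float(v) for label, v in features.items()}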

Operations of the systems and methods herein are shown in the following example, based on FIG. 3. Two candidate answers each have a matrix relating parts of the question (in the example, individual terms are parts of the question, but they could be groups of terms, concepts, relationships, etc.) and passages (which could be some other kind of evidence). Thus, in the example, each candidate answer has a three-dimensional matrix for features: feature label, passage, and question term.

Here is an example of how matrix Minput (from the method above) would look for the African Elephant:

                    large  land  animal  has  large  ears
Feature 1  P1.1         1     2       3    0      0     0
           P1.2         1     0       4    5      6     7
Feature 2  P1.1         9     8       7    0      0     0
           P1.2         0     0       6    5      4     3

Here is how Minput would look for Hippo:

                    large  land  animal  has  large  ears
Feature 1  P2.1         1     2       3    0      0     0
           P2.2         0     0       4    5      0     7
Feature 2  P2.1         9     8       7    0      0     0
           P2.2         0     0       6    5    −10     3

As an example, consider P=[Passage, Question Term] and that the only transformation function is IDENTITY (i.e., Mstart=Minput=M(P,0)). D(P,1) is the Passage dimension. The values for T(P,0) are thus tuples of features and question terms, e.g., (feature1, animal). The values for T(P,1) are the same tuples with features replaced by new features specific to the Passage dimension and one of the collapsing functions. Assume that the collapsing functions are SUM, MAXIMUM, and MINIMUM. For the purpose of this example, the new feature derived from feature1, the Passage dimension, and the SUM function is labeled feature1_SUM(Passage). Thus, the T(P,1) tuples corresponding to the value (feature1, animal) for T(P,0) would include the tuples (feature1_SUM(Passage), animal) and (feature1_MAXIMUM(Passage), animal). The entry for the former is computed as the sum across all values of Passage (i.e., P1.1 and P1.2) of all entries in M(P,0) such that the feature is feature1 and the question term is “animal”. The sum of 3 and 4 is 7, so the entry in M(P,1) for indexes feature1_SUM(Passage) and animal is set to 7. Similarly, the maximum of 3 and 4 is 4, so the entry in M(P,1) for indexes feature1_MAXIMUM(Passage) and animal is set to 4. The complete results for M(P,1) for African Elephant are:

African Elephant            large  land  animal  has  large  ears
feature1  SUM(Passage)          2     2       7    5      6     7
          MAXIMUM(Passage)      1     2       4    5      6     7
          MINIMUM(Passage)      0     0       3    0      0     0
feature2  SUM(Passage)          9     8      13    5      4     3
          MAXIMUM(Passage)      9     8       7    5      4     3
          MINIMUM(Passage)      0     0       6    0      0     0

Similarly, the complete results for M(P,1) for Hippo are:

Hippo                       large  land  animal  has  large  ears
feature1  SUM(Passage)          1     2       7    5      0     7
          MAXIMUM(Passage)      1     2       4    5      0     7
          MINIMUM(Passage)      0     0       3    0      0     0
feature2  SUM(Passage)          9     8      13    5    −10     3
          MAXIMUM(Passage)      9     8       7    5      0     3
          MINIMUM(Passage)      0     0       6    0    −10     0

D(P,2) is the Question Term dimension. Since the only dimensions in M(P,1) are now feature labels and question terms, the values for T(P,1) are thus atomic feature labels, e.g., feature1_SUM(Passage). The T(P,2) tuples are thus also atomic feature labels derived from those feature labels, the question term dimension, and the collapsing functions. For example, one instance of T(P,2) is labeled feature1_SUM(Passage)_SUM(Question Term). The value in M(P,2) for the feature1_SUM(Passage)_SUM(Question Term) index is the sum across all question terms of all values in M(P,1) for which the feature is feature1_SUM(Passage), i.e., 2+2+7+5+6+7, which equals 29. Applying this logic to M(P,2) for both answers produces:

African Elephant            SUM(Question  MAXIMUM(Question  MINIMUM(Question
                                   Term)             Term)             Term)
feature1  SUM(Passage)                29                 7                 2
          MAXIMUM(Passage)            25                 7                 1
          MINIMUM(Passage)             3                 3                 0
feature2  SUM(Passage)                42                13                 3
          MAXIMUM(Passage)            36                 9                 3
          MINIMUM(Passage)             6                 6                 0

Hippo                       SUM(Question  MAXIMUM(Question  MINIMUM(Question
                                   Term)             Term)             Term)
feature1  SUM(Passage)                22                 7                 0
          MAXIMUM(Passage)            19                 7                 0
          MINIMUM(Passage)             3                 3                 0
feature2  SUM(Passage)                28                13               −10
          MAXIMUM(Passage)            32                 9                 0
          MINIMUM(Passage)             6                 6               −10

Since P has length 2, M(P,2) is flattened. The elements in the flattened vector are labeled with the combined labels of all of the remaining dimensions, e.g.:

African Elephant:

feature1_SUM(Passage)_SUM(Question Term): 29

feature1_SUM(Passage)_MAXIMUM(Question Term): 7

feature1_SUM(Passage)_MINIMUM(Question Term): 2

feature1_MAXIMUM(Passage)_SUM(Question Term): 25

feature1_MAXIMUM(Passage)_MAXIMUM(Question Term): 7

feature1_MAXIMUM(Passage)_MINIMUM(Question Term): 1

feature1_MINIMUM(Passage)_SUM(Question Term): 3

feature1_MINIMUM(Passage)_MAXIMUM(Question Term): 3

feature1_MINIMUM(Passage)_MINIMUM(Question Term): 0

feature2_SUM(Passage)_SUM(Question Term): 42

feature2_SUM(Passage)_MAXIMUM(Question Term): 13

feature2_SUM(Passage)_MINIMUM(Question Term): 3

feature2_MAXIMUM(Passage)_SUM(Question Term): 36

feature2_MAXIMUM(Passage)_MAXIMUM(Question Term): 9

feature2_MAXIMUM(Passage)_MINIMUM(Question Term): 3

feature2_MINIMUM(Passage)_SUM(Question Term): 6

feature2_MINIMUM(Passage)_MAXIMUM(Question Term): 6

feature2_MINIMUM(Passage)_MINIMUM(Question Term): 0

Hippo:

feature1_SUM(Passage)_SUM(Question Term): 22

feature1_SUM(Passage)_MAXIMUM(Question Term): 7

feature1_SUM(Passage)_MINIMUM(Question Term): 0

feature1_MAXIMUM(Passage)_SUM(Question Term): 19

feature1_MAXIMUM(Passage)_MAXIMUM(Question Term): 7

feature1_MAXIMUM(Passage)_MINIMUM(Question Term): 0

feature1_MINIMUM(Passage)_SUM(Question Term): 3

feature1_MINIMUM(Passage)_MAXIMUM(Question Term): 3

feature1_MINIMUM(Passage)_MINIMUM(Question Term): 0

feature2_SUM(Passage)_SUM(Question Term): 28

feature2_SUM(Passage)_MAXIMUM(Question Term): 13

feature2_SUM(Passage)_MINIMUM(Question Term): −10

feature2_MAXIMUM(Passage)_SUM(Question Term): 32

feature2_MAXIMUM(Passage)_MAXIMUM(Question Term): 9

feature2_MAXIMUM(Passage)_MINIMUM(Question Term): 0

feature2_MINIMUM(Passage)_SUM(Question Term): 6

feature2_MINIMUM(Passage)_MAXIMUM(Question Term): 6

feature2_MINIMUM(Passage)_MINIMUM(Question Term): −10

These features are then used by a classifier to judge which of the two answers is correct, using answer classification mechanisms. Since the number of output features is much larger than the number of input features, feature selection methods can choose a subset of the features.
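
Running the collapse() sketch given earlier on the African Elephant Minput reproduces entries of the flattened vector above (the matrices below are the passage-by-term values from the Minput table):

    import numpy as np

    elephant = {
        "feature1": np.array([[1, 2, 3, 0, 0, 0],    # P1.1
                              [1, 0, 4, 5, 6, 7]]),  # P1.2
        "feature2": np.array([[9, 8, 7, 0, 0, 0],    # P1.1
                              [0, 0, 6, 5, 4, 3]]),  # P1.2
    }
    vfinal = collapse(elephant)
    print(vfinal["feature1_SUM(Passage)_SUM(Question Term)"])          # 29.0
    print(vfinal["feature2_MAXIMUM(Passage)_MAXIMUM(Question Term)"])  # 9.0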

An exemplary embodiment herein is shown as a method in the flowchart in FIG. 5. In item 400, this method receives a question (comprising question terms) into a user interface of a computerized device. Such methods/systems also automatically search sources of data 402 containing passages (e.g., text passages) using a processor of the computerized device to produce candidate answers to the question. The searching is based on the question terms, and the searching identifies sources of evidence 406 that support each of the candidate answers 404 based on scoring features that indicate whether the candidate answers are correct answers to the question.

Such methods/systems thus analyze an input question to determine precisely what it is asking for and generate many possible candidate answers through a broad search of large volumes of content. For each of these candidate answers, a hypothesis is formed based on considering the candidate in the context of the original question and topic. For each hypothesis, methods and systems herein spawn an independent thread that attempts to prove the hypothesis. The systems/methods search the content sources for evidence that supports or refutes each hypothesis. For each evidence-hypothesis pair, such systems/methods apply hundreds of methods that dissect and analyze the evidence along different scoring features such as type classification, time, geography, popularity, passage support, source reliability, and semantic relatedness. This analysis produces hundreds of such scoring features (which are sometimes referred to herein simply as “features”). Thus, the scoring features define the types of analysis that are used to determine whether the candidate answers are correct. The results of the search can be transformed by using a transformation function that takes an input value and information that provides context for the input value, and produces a transformed score value 408.

Such methods/systems automatically create a scoring feature-specific matrix for each scoring feature using the processor 410. Each scoring feature-specific matrix has a score field for each different combination of text passage and question term, and each score field holds a score value (vector value) indicating how each different combination of passage and question term supports the candidate answers as being a correct answer to the question (with respect to the corresponding scoring feature of the matrix). The different combinations of passages and question terms thereby form vectors, and each vector has a vector value equal to the corresponding score value.

The systems/methods herein automatically combine (collapse) the vectors in item 412 using the processor by calculating a statistical measure of the vectors (e.g., a collapsing function, such as calculating a maximum, a minimum, a sum, a mean, a median, and/or a standard deviation of the vectors) to produce a collapsed score for each of the question terms. Further, the systems/methods herein automatically combine the collapsed score for each question term using the processor to produce a combined score for each of the candidate answers. Then the systems and methods herein can automatically rank the candidate answers in item 414 based on the combined score for each candidate answer using the processor. In item 416, a single answer can be output from the user interface, or many answers (with their scores and evidence sources) can be output.

As will be appreciated by one skilled in the art, aspects of the systems and methods herein may be embodied as a system, method or computer program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable non-transitory medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The non-transitory computer storage medium stores instructions, and a processor executes the instructions to perform the methods described herein. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present disclosure are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments herein. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

A representative hardware environment for practicing the embodiments herein is depicted in FIG. 6. This schematic drawing illustrates a hardware configuration of an information handling/computer system in accordance with the embodiments herein. The system comprises at least one processor or central processing unit (CPU) 10. The CPUs 10 are interconnected via system bus 12 to various devices such as a random access memory (RAM) 14, read-only memory (ROM) 16, and an input/output (I/O) adapter 18. The I/O adapter 18 can connect to peripheral devices, such as disk units 11 and tape drives 13, or other program storage devices that are readable by the system. The system can read the inventive instructions on the program storage devices and follow these instructions to execute the methodology of the embodiments herein. The system further includes a user interface adapter 19 that connects a keyboard 15, mouse 17, speaker 24, microphone 22, and/or other user interface devices such as a touch screen device (not shown) to the bus 12 to gather user input. Additionally, a communication adapter 20 connects the bus 12 to a data processing network 25, and a display adapter 21 connects the bus 12 to a display device 23 which may be embodied as an output device such as a monitor, printer, or transmitter, for example.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments herein. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Deployment types include loading directly in the client, server and proxy computers via loading a storage medium such as a CD, DVD, etc. The process software may also be automatically or semi-automatically deployed into a computer system by sending the process software to a central server or a group of central servers. The process software is then downloaded into the client computers that will execute the process software. The process software can also be sent directly to the client system via e-mail; the process software is then either detached to a directory or loaded into a directory by a button on the e-mail that executes a program that detaches the process software into a directory. The process software can alternatively be sent directly to a directory on the client computer hard drive. When there are proxy servers, the process will select the proxy server code, determine on which computers to place the proxy servers' code, transmit the proxy server code, and then install the proxy server code on the proxy computer. The process software will be transmitted to the proxy server and then stored on the proxy server.

While it is understood that the process software may be deployed by manually loading directly in the client, server and proxy computers via loading a storage medium such as a CD, DVD, etc., the process software may also be automatically or semi-automatically deployed into a computer system by sending the process software to a central server or a group of central servers. The process software is then downloaded into the client computers that will execute the process software. Alternatively, the process software is sent directly to the client system via e-mail. The process software is then either detached to a directory or loaded into a directory by a button on the e-mail that executes a program that detaches the process software into a directory. Another alternative is to send the process software directly to a directory on the client computer hard drive. When there are proxy servers, the process will select the proxy server code, determine on which computers to place the proxy servers' code, transmit the proxy server code, and then install the proxy server code on the proxy computer. The process software will be transmitted to the proxy server and then stored on the proxy server.

In FIG. 7, Step 100 begins the deployment of the process software. The first thing is to determine if there are any programs that will reside on a server or servers when the process software is executed 101. If this is the case, then the servers that will contain the executables are identified 209. The process software for the server or servers is transferred directly to the servers' storage via FTP or some other protocol or by copying through the use of a shared file system 210. The process software is then installed on the servers 211. Next, a determination is made on whether the process software is to be deployed by having users access the process software on a server or servers 102. If the users are to access the process software on servers, then the server addresses that will store the process software are identified 103.

A determination is made if a proxy server is to be built 200 to store the process software. A proxy server is a server that sits between a client application, such as a Web browser, and a real server. It intercepts all requests to the real server to see if it can fulfill the requests itself. If not, it forwards the request to the real server. The two primary benefits of a proxy server are to improve performance and to filter requests. If a proxy server is required, then the proxy server is installed 201. The process software is sent to the servers either via a protocol such as FTP or it is copied directly from the source files to the server files via file sharing 202.

Another embodiment would be to send a transaction to the servers that contained the process software and have the server process the transaction, then receive and copy the process software to the server's file system. Once the process software is stored at the servers, the users, via their client computers, then access the process software on the servers and copy it to their client computers' file systems 203. Another embodiment is to have the servers automatically copy the process software to each client and then run the installation program for the process software at each client computer. The user executes the program that installs the process software on his client computer 212, then exits the process 108.

In step 104 a determination is made whether the process software is to be deployed by sending the process software to users via e-mail. The set of users where the process software will be deployed are identified together with the addresses of the user client computers 105. The process software is sent via e-mail 204 to each of the users' client computers. The users then receive the e-mail 205 and then detach the process software from the e-mail to a directory on their client computers 206. The user executes the program that installs the process software on his client computer 212, then exits the process 108.

Lastly, a determination is made on whether the process software will be sent directly to user directories on their client computers 106. If so, the user directories are identified 107. The process software is transferred directly to the user's client computer directory 207. This can be done in several ways such as, but not limited to, sharing of the file system directories and then copying from the sender's file system to the recipient user's file system or alternatively using a transfer protocol such as File Transfer Protocol (FTP). The users access the directories on their client file systems in preparation for installing the process software 208. The user executes the program that installs the process software on his client computer 212, then exits the process 108.

The process software is integrated into a client, server and network environment by providing for the process software to coexist with applications, operating systems and network operating systems software and then installing the process software on the clients and servers in the environment where the process software will function.

The first step is to identify any software on the clients and servers, including the network operating system where the process software will be deployed, that is required by the process software or that works in conjunction with the process software. This includes the network operating system, which is software that enhances a basic operating system by adding networking features.

Next, the software applications and version numbers will be identified and compared to the list of software applications and version numbers that have been tested to work with the process software. Those software applications that are missing or that do not match the correct version will be upgraded with the correct version numbers. Program instructions that pass parameters from the process software to the software applications will be checked to ensure the parameter lists match the parameter lists required by the process software. Conversely, parameters passed by the software applications to the process software will be checked to ensure the parameters match the parameters required by the process software. The client and server operating systems, including the network operating systems, will be identified and compared to the list of operating systems, version numbers and network software that have been tested to work with the process software. Those operating systems, version numbers and network software that do not match the list of tested operating systems and version numbers will be upgraded on the clients and servers to the required level.

After ensuring that the software, where the process software is to be deployed, is at the correct version level that has been tested to work with the process software, the integration is completed by installing the process software on the clients and servers.

In FIG. 8, Step 220 begins the integration of the process software. The first thing is to determine if there are any process software programs that will execute on a server or servers 221. If this is not the case, then integration proceeds to 227. If this is the case, then the server addresses are identified 222. The servers are checked to see if they contain software that includes the operating system (OS), applications, and network operating systems (NOS), together with their version numbers, that have been tested with the process software 223. The servers are also checked to determine if there is any missing software that is required by the process software 223.

A determination is made if the version numbers match the version numbers of OS, applications and NOS that have been tested with the process software 224. If all of the versions match and there is no missing required software, the integration continues in 227.

If one or more of the version numbers do not match, then the unmatched versions are updated on the server or servers with the correct versions 225. Additionally, if there is missing required software, then it is updated on the server or servers 225. The server integration is completed by installing the process software 226.

Step 227, which follows either 221, 224 or 226, determines if there are any programs of the process software that will execute on the clients. If no process software programs execute on the clients, the integration proceeds to 230 and exits. If this is not the case, then the client addresses are identified 228.

The clients are checked to see if they contain software that includes the operating system (OS), applications, and network operating systems (NOS), together with their version numbers, that have been tested with the process software 229. The clients are also checked to determine if there is any missing software that is required by the process software 229.

A determination is made as to whether the version numbers match the version numbers of OS, applications and NOS that have been tested with the process software 231. If all of the versions match and there is no missing required software, then the integration proceeds to 230 and exits.

If one or more of the version numbers do not match, then the unmatched versions are updated on the clients with the correct versions 232. In addition, if there is missing required software, then it is updated on the clients 232. The client integration is completed by installing the process software on the clients 233. The integration proceeds to 230 and exits.

The process software can be stored on a shared file system accessible from one or more servers. The process software is executed via transactions that contain data and server processing requests that use CPU units on the accessed server. CPU units are units of time, such as minutes, seconds, or hours, on the central processor of the server. Additionally, the accessed server may make requests of other servers that require CPU units. CPU units are an example that represents but one measurement of use. Other measurements of use include, but are not limited to, network bandwidth, memory usage, storage usage, packet transfers, complete transactions, etc. When multiple customers use the same process software application, their transactions are differentiated by the parameters included in the transactions that identify the unique customer and the type of service for that customer. All of the CPU units and other measurements of use that are used for the services for each customer are recorded. When the number of transactions to any one server reaches a number that begins to affect the performance of that server, other servers are accessed to increase the capacity and to share the workload. Likewise, when other measurements of use, such as network bandwidth, memory usage, storage usage, etc., approach a capacity so as to affect performance, additional network bandwidth, memory usage, storage, etc. are added to share the workload. The measurements of use used for each service and customer are sent to a collecting server that sums the measurements of use for each customer for each service that was processed anywhere in the network of servers that provide the shared execution of the process software. The summed measurements of use units are periodically multiplied by unit costs, and the resulting total process software application service costs are alternatively sent to the customer and/or indicated on a web site accessed by the customer, which then remits payment to the service provider. In another embodiment, the service provider requests payment directly from a customer account at a banking or financial institution. In another embodiment, if the service provider is also a customer of the customer that uses the process software application, the payment owed to the service provider is reconciled to the payment owed by the service provider to minimize the transfer of payments.
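
To make the metering concrete, here is a small sketch (with invented field names and metric keys) of recording per-customer, per-service measurements of use, as a collecting server might sum them:

```python
# Illustrative metering of shared execution (invented field names).
from collections import defaultdict

class UsageCollector:
    """Sums measurements of use per (customer, service), cf. the collecting server."""
    def __init__(self) -> None:
        self.totals: dict = defaultdict(lambda: defaultdict(float))

    def record(self, customer: str, service: str, measurements: dict[str, float]) -> None:
        # measurements, e.g. {"cpu_units": 1.5, "network_mb": 12.0, "storage_mb": 3.0}
        for metric, amount in measurements.items():
            self.totals[(customer, service)][metric] += amount

collector = UsageCollector()
collector.record("customer-A", "qa-service", {"cpu_units": 1.5, "network_mb": 12.0})
collector.record("customer-A", "qa-service", {"cpu_units": 0.5})
print(dict(collector.totals[("customer-A", "qa-service")]))  # summed per metric
```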

The process software is shared, simultaneously serving multiple customers in a flexible, automated fashion. It is standardized, requiring little customization, and it is scalable, providing capacity on demand in a pay-as-you-go model.

In FIG. 9, Step 240 begins the On Demand process. A transaction is created that contains the unique customer identification, the requested service type and any service parameters that further specify the type of service 241. The transaction is then sent to the main server 242. In an On Demand environment the main server can initially be the only server; then, as capacity is consumed, other servers are added to the On Demand environment.

The server central processing unit (CPU) capacities in the On Demand environment are queried 243. The CPU requirement of the transaction is estimated, and the servers' available CPU capacities in the On Demand environment are compared to the transaction CPU requirement to see if there is sufficient available CPU capacity in any server to process the transaction 244. If there is not sufficient server CPU capacity available, then additional server CPU capacity is allocated to process the transaction 248. If there was already sufficient available CPU capacity, then the transaction is sent to a selected server 245.
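
A minimal sketch of this selection logic (cf. steps 243-245 and 248), under the assumption of a simple map from server name to available CPU capacity; the capacity model and the provisioning hook are hypothetical:

```python
# Illustrative capacity check and server selection (cf. steps 243-245, 248).
# The capacity model and allocate_capacity hook are hypothetical.
def allocate_capacity(servers: dict[str, float], required_cpu: float) -> str:
    """Hypothetical hook: provision a new server with enough CPU capacity."""
    name = f"server-{len(servers) + 1}"
    servers[name] = required_cpu
    return name

def select_server(servers: dict[str, float], required_cpu: float) -> str:
    """Send the transaction to any server with sufficient CPU, else allocate more."""
    for name, available in servers.items():
        if available >= required_cpu:
            return name                              # step 245
    return allocate_capacity(servers, required_cpu)  # step 248
```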

Before executing the transaction, a check is made of the remaining On Demand environment to determine if the environment has sufficient available capacity for processing the transaction. This environment capacity consists of such things as, but not limited to, network bandwidth, processor memory, storage, etc. 246. If there is not sufficient available capacity, then capacity will be added to the On Demand environment 247. Next, the required software to process the transaction is accessed, loaded into memory, then the transaction is executed 249.

The usage measurements are recorded 250. The usage measurements consist of the portions of those functions in the On Demand environment that are used to process the transaction. The usage of such functions as, but not limited to, network bandwidth, processor memory, storage and CPU cycles is recorded. The usage measurements are summed, multiplied by unit costs and then recorded as a charge to the requesting customer 251. If the customer has requested that the On Demand costs be posted to a web site 252, then they are posted 253.
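
For example, the sum-and-multiply of step 251 reduces to a dot product of summed usage with per-metric unit costs; the rates below are invented, purely for illustration:

```python
# Illustrative charge computation (cf. step 251): summed usage times unit costs.
UNIT_COSTS = {"cpu_units": 0.02, "network_mb": 0.001, "storage_mb": 0.0005}  # invented rates

def charge(usage: dict[str, float]) -> float:
    """Multiply each summed measurement by its unit cost and total the result."""
    return sum(amount * UNIT_COSTS.get(metric, 0.0) for metric, amount in usage.items())

print(round(charge({"cpu_units": 120.0, "network_mb": 450.0, "storage_mb": 2000.0}), 2))  # 3.85
```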

If the customer has requested that the On Demand costs be sent via e-mail to a customer address 254, then they are sent 255. If the customer has requested that the On Demand costs be paid directly from a customer account 256, then payment is received directly from the customer account 257. The last step is to exit the On Demand process 258.

The process software may be deployed, accessed and executed through the use of a virtual private network (VPN), which is any combination of technologies that can be used to secure a connection through an otherwise unsecured or untrusted network. VPNs are used to improve security and to reduce operational costs. The VPN makes use of a public network, usually the Internet, to connect remote sites or users together. Instead of using a dedicated, real-world connection such as a leased line, the VPN uses “virtual” connections routed through the Internet from the company's private network to the remote site or employee.

The process software may be deployed, accessed and executed through either a remote-access or a site-to-site VPN. When using remote-access VPNs, the process software is deployed, accessed and executed via the secure, encrypted connections between a company's private network and remote users through a third-party service provider. The enterprise service provider (ESP) sets up a network access server (NAS) and provides the remote users with desktop client software for their computers. The telecommuters can then dial a toll-free number or attach directly via a cable or DSL modem to reach the NAS and use their VPN client software to access the corporate network and to access, download and execute the process software.

When using the site-to-site VPN, the process software is deployed, accessed and executed through the use of dedicated equipment and large-scale encryption that are used to connect a company's multiple fixed sites over a public network such as the Internet.

The process software is transported over the VPN via tunneling, which is the process of placing an entire packet within another packet and sending it over a network. The protocol of the outer packet is understood by the network and both points, called tunnel interfaces, where the packet enters and exits the network.
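
Tunneling can be pictured as byte-level encapsulation; the following toy sketch (an invented framing format, not any real VPN protocol) places an entire inner packet inside an outer packet with a small header:

```python
# Toy encapsulation to illustrate tunneling: an entire packet becomes the
# payload of another packet. The 5-byte header format is invented.
import struct

def encapsulate(inner_packet: bytes, outer_protocol: int) -> bytes:
    """Wrap the inner packet behind a protocol byte and a payload length."""
    header = struct.pack("!BI", outer_protocol, len(inner_packet))
    return header + inner_packet

def decapsulate(outer_packet: bytes) -> bytes:
    """Recover the original inner packet at the far tunnel interface."""
    _protocol, length = struct.unpack("!BI", outer_packet[:5])
    return outer_packet[5:5 + length]

assert decapsulate(encapsulate(b"inner packet", outer_protocol=1)) == b"inner packet"
```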

In FIGS. 10 and 11, Step 260 begins the Virtual Private Network (VPN) process. A determination is made to see if a VPN for remote access is required 261. If it is not required, then proceed to 262. If it is required, then determine if the remote access VPN exists 264.

If it does exist, then proceed to 265. Otherwise, identify the third party provider that will provide the secure, encrypted connections between the company's private network and the company's remote users 276. The company's remote users are identified 277. The third party provider then sets up a network access server (NAS) 278 that allows the remote users to dial a toll-free number or attach directly via a cable or DSL modem to access, download and install the desktop client software for the remote-access VPN 279.

After the remote access VPN has been built, or if it had been previously installed, the remote users can then access the process software by dialing into the NAS or attaching directly via a cable or DSL modem into the NAS 265. This allows entry into the corporate network where the process software is accessed 266. The process software is transported to the remote user's desktop over the network via tunneling 273. That is, the process software is divided into packets, and each packet, including the data and protocol, is placed within another packet 267. When the process software arrives at the remote user's desktop, it is removed from the packets, reconstituted and then executed on the remote user's desktop 268.
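
A toy sketch of the divide-and-reconstitute flow (cf. steps 267-268), complementing the encapsulation sketch above; the chunk size is an arbitrary illustrative choice:

```python
# Toy packetization and reassembly (cf. steps 267-268). Framing is invented.
def to_packets(software: bytes, chunk_size: int = 1024) -> list[bytes]:
    """Divide the process software image into fixed-size packets."""
    return [software[i:i + chunk_size] for i in range(0, len(software), chunk_size)]

def reconstitute(packets: list[bytes]) -> bytes:
    """Rejoin the packets into the original image at the user's desktop."""
    return b"".join(packets)

image = b"process software image" * 100
assert reconstitute(to_packets(image)) == image
```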

A determination is made to see if a VPN for site-to-site access is required 262. If it is not required, then proceed to exit the process 263. Otherwise, determine if the site-to-site VPN exists 269. If it does exist, then proceed to 272. Otherwise, install the dedicated equipment required to establish a site-to-site VPN 270. Then build the large-scale encryption into the VPN 271.

After the site-to-site VPN has been built, or if it had been previously established, the users access the process software via the VPN 272. The process software is transported to the site users over the network via tunneling. That is, the process software is divided into packets, and each packet, including the data and protocol, is placed within another packet 274. When the process software arrives at the remote user's desktop, it is removed from the packets, reconstituted and executed on the site user's desktop 275. Proceed to exit the process 263.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of this disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

What is claimed is:
1. A computerized device comprising: a processor; and a user interface operatively connected to said processor, said user interface receiving a question comprising question terms, said processor automatically searching sources of data containing passages to produce candidate answers to said question, said searching being based on said question terms, and said searching identifying passages that support each of said candidate answers based on scoring features that indicate whether said candidate answers are correct answers to said question, said processor automatically creating a scoring feature-specific matrix for each scoring feature of said scoring features, each said scoring feature-specific matrix specifying all different combinations of said passages, said candidate answers, and said question terms as vectors and comprising score fields for score values for each specific question term with respect to a specific passage and a specific candidate answer, and each score field containing a score value corresponding to a vector and indicating how a passage term of said specific passage aligns with said specific question term to support said specific candidate answer as being a correct answer to said question; said processor automatically combining said vectors by calculating a statistical measure of said vectors to produce a collapsed score for each of said question terms, said statistical measure comprising a collapsing function; said processor automatically combining collapsed scores to produce a combined score for each of said candidate answers; and said processor automatically ranking said candidate answers based on each said score value.
2. The computerized device according to claim 1, each of said scoring features defining a different type of analysis for determining whether each of said candidate answers is correct.
3. The computerized device according to claim 1, said passages comprising text passages.
4. The computerized device according to claim 1, said collapsing function calculating one of a maximum, a minimum, a sum, a mean, a median, and a standard-deviation of said vectors.
5. A computerized device comprising: a processor; and a user interface operatively connected to said processor, said user interface receiving a question comprising question terms, said processor automatically searching sources of data containing passages using a processor of said computerized device to produce candidate answers to said question, said searching being based on said question terms, and said searching identifying sources of evidence that support each of said candidate answers based on scoring features that indicate whether said candidate answers are correct answers to said question, said sources of evidence comprising passages, said processor automatically creating a scoring feature-specific matrix for each scoring feature of said scoring features, each said scoring feature-specific matrix specifying all different combinations of said passages, said candidate answers, and said question terms and comprising score fields for score values for each specific question term with respect to a specific passage and a specific candidate answer, each score field containing a score value indicating how a passage term of said specific passage aligns with said specific question term to support said specific candidate answer as being a correct answer to said question, and multiple ones of said different combinations of said passages, said candidate answers and said question terms forming vectors, said processor automatically combining said vectors to produce a collapsed score for each of said question terms, said processor automatically combining collapsed scores to produce a combined score for each of said candidate answers, and said processor automatically ranking said candidate answers based on said combined score.
6. The computerized device according to claim 5, each of said scoring features defining a different type of analysis for determining whether each of said candidate answers is correct.
7. The computerized device according to claim 5, said passages comprising text passages.
8. The computerized device according to claim 6, said vectors being automatically combined by calculating a statistical measure of said vectors to produce said collapsed score for each of said question terms, and said statistical measure comprising a collapsing function.
9. The computerized device according to claim 6, said vectors being automatically combined by calculating a statistical measure of said vectors to produce said collapsed score for each of said question terms, and said statistical measure comprising a collapsing function that calculates one of a maximum, a minimum, a sum, a mean, a median, and a standard-deviation of said vectors.
10. A computerized device comprising: a processor; and a user interface operatively connected to said processor, said user interface receiving a question comprising question terms, said processor automatically searching sources of data containing passages using a processor of said computerized device to produce candidate answers to said question, said searching being based on said question terms, and said searching identifying sources of evidence that support each of said candidate answers based on scoring features that indicate whether said candidate answers are correct answers to said question, said sources of evidence comprising passages, said processor automatically creating a scoring feature-specific matrix for each scoring feature of said scoring features, each said scoring feature-specific matrix specifying all different combinations of said passages, said candidate answers, and said question terms and comprising score fields for score values for each specific question term with respect to a specific passage and a specific candidate answer, each score field containing a score value indicating how a passage term of said specific passage aligns with said specific question term to support said specific candidate answer as being a correct answer to said question with respect to said scoring feature, and multiple ones of said different combinations of said passages, said candidate answers, and said question terms forming vectors, said processor automatically combining said vectors by calculating a statistical measure of said vectors to produce a collapsed score for each of said question terms, said processor automatically combining collapsed scores for each of said question terms to produce a combined score for each of said candidate answers, and said processor automatically ranking said candidate answers based on said combined score for each of said candidate answers.
11. The computerized device according to claim 10, each of said scoring features defining a different type of analysis for determining whether each of said candidate answers is correct.
12. The computerized device according to claim 10, said passages comprising text passages.
13. The computerized device according to claim 11, said statistical measure comprising a collapsing function.
14. The computerized device according to claim 11, said statistical measure comprising a collapsing function that calculates one of a maximum, a minimum, a sum, a mean, a median, and a standard-deviation of said vectors.
15. A computer program product for automatically answering questions, said computer program product comprising a non-transitory computer readable storage medium storing program code, said program code being readable/executable by a computer to perform a method comprising: receiving a question comprising question terms into a user interface of a computerized device; automatically searching sources of data containing passages to produce candidate answers to said question, said searching being based on said question terms, and said searching identifying passages that support each of said candidate answers based on scoring features that indicate whether said candidate answers are correct answers to said question; automatically creating a scoring feature-specific matrix for each scoring feature of said scoring features, each said scoring feature-specific matrix specifying all different combinations of said passages, said candidate answers, and said question terms as vectors and comprising score fields for score values for each specific question term with respect to a specific passage and a specific candidate answer, each score field containing a score value corresponding to a vector and indicating how a passage term of said specific passage aligns with said specific question term to support said specific candidate answer as being a correct answer to said question; automatically combining said vectors by calculating a statistical measure of said vectors to produce a collapsed score for each of said question terms, said statistical measure comprising a collapsing function; automatically combining collapsed scores to produce a combined score for each of said candidate answers; and automatically ranking said candidate answers based on each said score value.
16. The computer program product according to claim 15, each of said scoring features defining a different type of analysis for determining whether each of said candidate answers is correct.
17. The computer program product according to claim 15, said passages comprising text passages.
18. The computer program product according to claim 15, said collapsing function calculating one of a maximum, a minimum, a sum, a mean, a median, and a standard-deviation of said vectors.